Qdrant
Overview
Qdrant is a cloud-native vector database for building generative AI applications. It can store arbitrary payloads alongside the vector embeddings, which supports AI-driven applications. See https://qdrant.com for more information. Qarbine supports native Qdrant vector query interactions. For example, the specification below
{
index : "test_collection",
query: {
vector : [0.2, 0.1, 0.9, 0.7],
limit : 3
}
}
can return the answer set below. The structure of the first element is shown to the right.
The vector argument above can be an explicit array of numbers or be dynamically determined using Qarbine’s integration with many popular Gen AI services. Legacy analytic and reporting tools were never designed to support vector queries and vector database features. As a result, popular SQL centric tools are unable to leverage Qdrant’s features. In contrast, Qarbine is built for native Qdrant data interactions end to end.
Besides the standard Qdrant query interaction shown above, Qarbine also provides an optional and very convenient SQL “look alike” interface to retrieve Qdrant data. This interaction is described in more detail below. The underlying queries are, of course, still native Qdrant in nature as are the returned answer sets with elements such as those shown above.
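For orientation, the sketch below shows how a specification like the first one in this section might correspond to a request against Qdrant's REST search endpoint (POST /collections/{collection_name}/points/search). The to_qdrant_search helper is hypothetical, and the mapping of Qarbine's index field to a Qdrant collection name is an assumption; how Qarbine actually builds its requests is not described here.

```python
import json


def to_qdrant_search(spec: dict) -> tuple[str, str]:
    """Return the (path, JSON body) of a Qdrant search request for a spec.

    Hypothetical helper. Assumes `index` is the Qdrant collection name and
    that `query` is passed through unchanged as the request body.
    """
    path = f"/collections/{spec['index']}/points/search"
    body = json.dumps(spec["query"])
    return path, body


spec = {
    "index": "test_collection",
    "query": {"vector": [0.2, 0.1, 0.9, 0.7], "limit": 3},
}
path, body = to_qdrant_search(spec)
print("POST", path)
print(body)
```

The point of the sketch is only that the query portion of the specification is already in Qdrant's own vocabulary (vector, limit), which is why no SQL translation layer is involved.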
Qarbine can perform a Qdrant vector query and then easily analyze the data and format an analysis report. This interaction can also be embedded into applications for a seamless end user experience. This avoids painfully navigating application silos and losing context along the way.
There is no flattening of data such as is required by legacy SQL oriented tools with nested data structures. Qdrant has no native SQL interface anyway, so even that burdensome approach is not available. Unlike legacy tools, with Qarbine the Qdrant payload can be a deeply nested data structure. It is left as-is for optimal data analytics.
Defining a Data Source
Overview
A Data Source is a Qarbine component responsible for retrieving data from somewhere. At a high level it has a name, a description and some arbitrary query string which when sent to the associated Qarbine Data Service endpoint returns some data. The overall execution flow for an analysis, including the optional prompt component, is shown below.
A single data source can be referenced by name from multiple Qarbine template components. This provides a single point of change when, for example, an index is added or some other query tweak is necessary. The alternative is to find every template impacted by a schema or index change, which can be a daunting task across an organization. This component reusability is especially beneficial when team members have varying roles and skills.
Example Retrieval
The vector value in the query can be a bit bulky to manage and is also likely not going to be hardcoded. When Qarbine is embedded into an application, the application can supply the vector and make use of a variable placeholder in the query definition. A sample definition using this approach is shown below.
{
index : "my_books",
query: {
limit: 2,
vector: @vector
}
}
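To illustrate the idea of a variable placeholder, the sketch below replaces @name tokens in a definition string with JSON-encoded values. It is an illustration only: fill_placeholders is a hypothetical helper, and Qarbine performs its own substitution, whose exact rules are not described here.

```python
import json
import re


def fill_placeholders(definition: str, variables: dict) -> str:
    """Replace each @name token with the JSON encoding of variables[name].

    Illustrative assumption, not Qarbine's implementation. A name that is
    missing from `variables` raises KeyError.
    """
    def replace(match: re.Match) -> str:
        return json.dumps(variables[match.group(1)])

    return re.sub(r"@([A-Za-z_]\w*)", replace, definition)


definition = '{ index: "my_books", query: { limit: 2, vector: @vector } }'
filled = fill_placeholders(definition, {"vector": [0.2, 0.1, 0.9, 0.7]})
print(filled)
```

A naive pattern like this would also rewrite an @ that appears inside a quoted string, so a real implementation needs to be aware of the definition's syntax.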
The book sample data is from https://qdrant.tech/documentation/tutorials/search-beginners/. It uses the all-MiniLM-L6-v2 encoder. For more information on the model see
Qarbine can be configured to access various generative AI services including those from OpenAI, Microsoft Azure OpenAI, and AWS Bedrock. Rather than directly setting the vector containing many numbers in the retrieval definition, you can have Qarbine obtain the vector value via the nearText phrase argument. This assumes that the embedding model used for the query produces vectors of the same dimension as the model that produced the vectors in your index. Below is an example Data Source definition using this approach.
{
index : "my_books",
nearText: "cyber space",
query: {
limit: 2
}
}
The specification below retrieves the first 5 similar books based on the nearText and useAssistant options.
{
"index": "my_books",
"nearText": "cyber space",
"useAssistant": "myHuggingFace",
"query": {
"with_vector": false,
"with_payload": true,
"limit": 5
}
}
The Qarbine SQL equivalent is shown below.
select * from my_books
where embedding using myHuggingFace near 'cyber space'
limit 5
Sample results are shown below.
The primary data of interest is within the payload field. Qarbine supports answer set manipulation via pragmas. We can adjust the Qarbine data source definition as shown below.
This results in the following much simpler answer set.
In analysis templates, cell formulas can then use #description instead of #payload.description to access the description value. This also reduces the size of the returned answer set. Other pragma manipulations are possible as well. See the Data Source Designer documentation for more details.
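The effect of a pragma such as ‘#pragma pullFieldsUp payload’ can be pictured with the short sketch below. It is not Qarbine's implementation: pull_fields_up is a hypothetical function, the sample row is invented for illustration, and the rule that a hoisted field wins on a name clash is an assumption.

```python
def pull_fields_up(rows: list[dict], field: str) -> list[dict]:
    """Hoist the keys of row[field] to the top level and drop the wrapper.

    Assumption: on a name clash the hoisted value wins. Rows whose
    `field` is missing or not a dict are copied unchanged.
    """
    result = []
    for row in rows:
        nested = row.get(field)
        if not isinstance(nested, dict):
            result.append(dict(row))
            continue
        flat = {key: value for key, value in row.items() if key != field}
        flat.update(nested)
        result.append(flat)
    return result


# Invented sample row, shaped like a Qdrant search hit.
rows = [
    {
        "id": 1,
        "score": 0.57,
        "payload": {"name": "Example Book", "description": "An example."},
    }
]
simplified = pull_fields_up(rows, "payload")
print(simplified)
```

After the transformation a formula can address description directly at the top level of each element.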
Rather than hardcoding ‘nearText: "cyber space"’ as above, we can use a variable in the definition as well. Placeholders in the definition can be identified using “@variableName” or “[! macroExpression !]” syntax. For example,
{
index : "my_books",
nearText: @userInput,
useAssistant: "myHuggingFace",
query: {
limit: 3
}
}
Running this will first present a prompt for the userInput variable.
Enter the text to locate similar books for, such as “pirates”.
Click OK.
The query definition’s nearText field is set to “pirates” and the updated definition sent to the Qarbine backend. Qarbine retrieves the vector for “pirates” and sets that into the Qdrant query’s vector field argument. That query is sent to Qdrant and the results sent back to the Qarbine backend. The ‘#pragma pullFieldsUp payload’ post processing then occurs. The final data is sent back to the Qarbine front end and shown. Below is an example result.
This style of end to end integration makes leveraging Qdrant’s generative AI and data features extremely easy for everyone, not just developers! Developer applications can rapidly gain Qdrant benefits and roll the capabilities out to end users. End users gain in-app analytic functionality for the underlying Qdrant data.
Managing Answer Set Size
The default maximum number of rows starts off at 25 for a new data source. This is useful while evolving a query from a concept to one that you have verified returns the desired answer set. Any native way of limiting the answer set size is the preferred approach. This setting is in the component dialog as shown below and also accessible by clicking the ‘Gear’ icon.
Once you are done drafting you can adjust this parameter. A “0” indicates there is no maximum. A number greater than 0 indicates to limit the final answer set size to that number of rows. This answer set truncation comes after any native query limit. So, if the answer set from the data endpoint is quite large, that content has to be returned to the Qarbine host. It then may truncate the number of rows. It is best to truncate at the query level (i.e., use a limit) to reduce the content sent from the data endpoint to the Qarbine host in the first place.
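The maximum rows rule described above can be summarized in a few lines. This is a sketch of the described behavior, with apply_max_rows as a hypothetical name rather than a Qarbine function.

```python
def apply_max_rows(rows: list, max_rows: int) -> list:
    """Truncate an already retrieved answer set; 0 means no maximum."""
    if max_rows < 0:
        raise ValueError("max_rows must be 0 or greater")
    return rows if max_rows == 0 else rows[:max_rows]


# Pretend the data endpoint returned 100 rows because the query had no limit.
returned = list(range(100))
print(len(apply_max_rows(returned, 25)))  # 25
print(len(apply_max_rows(returned, 0)))   # 100
```

Note that all 100 rows were transferred before the truncation happened, which is why a limit in the query itself is the better choice.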
Adjusting the Maximum Rows
Recall the default maximum rows at the component level is 25. When you are satisfied with your query you can change that setting by clicking the ‘Gear’ icon.
Adjust the setting to “0” indicating no Qarbine answer set truncation.
Click
Prompt Integration
Overview
Qarbine prompts provide a way to obtain runtime values and variables for data source and template execution. To avoid hardcoding, prompts can use macro formulas to run queries which populate list widgets. Prompts are defined in a no code manner using the Prompt Designer. Shown below is the execution flow when there is a Prompt component.
The Prompt Designer supports a large variety of input widgets including entry fields, check boxes, radio button groups, sliders, and file input.
Example
Let’s define a Qarbine Prompt component to obtain the userInput variable value to apply in the Data Source. This will soon be leveraged from a Qarbine Template. The Prompt Designer is basically a no-code dialog builder. In this example we are only asking the user for a single value. Qarbine prompts can ask for many values and present entry fields, lists, checkboxes, radio buttons and other widgets.
The running prompt is shown below.
The Qarbine prompt component has 2 elements.
The first element is defined as
Notice the image URL can be a macro language expression and not just a simple string. The second element is defined as
The component is saved in the Qarbine catalog and can be referenced by data sources and analysis templates.
Defining an Analysis Template
Overview
A template defines how to process the data being retrieved from Data Source queries and other data expressions. It also defines formulas, formatting options, and other analysis and presentation options. The overall execution flow for an analysis, including the optional prompt component, is shown below.
Qarbine provides an extremely large set of formatting and data interaction functionality to produce publication quality, interactive analytics. There are over 450 Excel-like macro functions to apply when defining templates. These include aggregation functions such as max, min, avg, sum, count, etc.
Using the Template Designer
The Template Designer tool integrates features leveraging Microsoft Word formatting, Excel formulas, and PowerPoint layout concepts. The template defines how to iterate over the retrieved data, apply formulas, and present the results. The results can be publication quality reports with interactive end user options as well.
The result of running the template described in this section is shown below.
It presents books from the sample my_books index based on an end user provided description. This sample data has just a few payload fields and none of them have nested content. Qarbine handles any data shape, even very dynamic ones all within the same answer set.
The template’s primary properties are shown below.
. . .
It uses the Data Source defined previously. It references the Prompt as shown below.
The general cell layout is shown below.
The right hand side of the Template Designer shows any metadata about the data source data. (No cell may be selected in the grid area for this to appear.)
This is a fairly simple template. The first body line uses a 14 point, bold Arial font for all of its cells. The ‘#’ prefix indicates a field of the active element. Any ‘@’ prefix refers to a variable. This template does not use any variables. However, consider a sales report using a payload with a quantity and a price field. A body cell could define the variable ‘cost’ as ‘set cost = #quantity * #price’. A summary cell formula ‘= sum(@cost)’ would then accumulate the costs of every product in the report.
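As a worked example of the sales report idea, the sketch below computes a per-row cost from quantity and price and then accumulates it, which is what the body cell and summary cell formulas would do together. It is a plain Python analogue with invented rows, not Qarbine's macro language.

```python
# Invented payload rows for a hypothetical sales report.
rows = [
    {"product": "A", "quantity": 2, "price": 10},
    {"product": "B", "quantity": 3, "price": 5},
]

total_cost = 0
for row in rows:
    # Body cell: the per-row variable, quantity times price.
    cost = row["quantity"] * row["price"]
    # Summary cell: accumulate the per-row cost across the report.
    total_cost += cost

print(total_cost)  # 35
```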
Running this template first presents the prompt dialog, where the user types into the text area.
Clicking OK propagates that variable value into the template execution flow. As described above, the “cyber space” userInput value flows to the Qarbine backend. Qarbine retrieves the vector for the “cyber space” text from Hugging Face’s all-MiniLM-L6-v2 sentence-transformer and sets that into the Qdrant query’s vector field argument. The Qdrant query is sent to the Qdrant database and the results are sent back to the Qarbine backend. This data is then processed based on the template definition. The result is then shown to the user.
Query by Example and Report by Example
Qarbine has over 10 integrated tools which enable modern database analytic reporting for distributed teams. The Query by Example (QBE) and Report by Example (RBE) tools enable the user to type in simplified criteria and have Qarbine automatically generate the full query.
QBE and RBE understand equal and not equal criteria. To specify a Qdrant ‘should’ condition, prefix the value with a ‘~’. A sample ‘should equal red’ cell criteria value is ‘= ~red’. Strings with spaces must be quoted, for example "~red stuff".
Below is an example of a variety of the supported constructs.
The resulting query specification automatically generated by Qarbine is shown below.
select city,color,rating
from test_collection2
where city = '~London' AND color = 'red' AND rating < 8
AND nearVector(0.2,0.1,0.9,0.7)
limit 25
The effective low level specification is shown below via an Alt-Run click.
The user does not need to know the low level Qdrant query syntax at all. Qarbine handles the query generation! For more detailed control use the Data Source Designer.
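To give a feel for what such a generated filter could look like, the sketch below turns simple criteria into Qdrant's filter structure (must and should lists holding match and range conditions). The build_filter function and its criteria format are hypothetical; the exact specification Qarbine emits is not reproduced here and may differ.

```python
def build_filter(criteria: list[tuple[str, str, object]]) -> dict:
    """Build a Qdrant filter from (field, operator, value) criteria.

    Assumptions of this sketch: a string value starting with '~' goes into
    the 'should' list and everything else into 'must'; only '=' and the
    four numeric comparisons are handled.
    """
    range_ops = {"<": "lt", "<=": "lte", ">": "gt", ">=": "gte"}
    result: dict = {}
    for field, op, value in criteria:
        clause = "must"
        if isinstance(value, str) and value.startswith("~"):
            clause, value = "should", value[1:]
        if op == "=":
            condition = {"key": field, "match": {"value": value}}
        elif op in range_ops:
            condition = {"key": field, "range": {range_ops[op]: value}}
        else:
            raise ValueError(f"unsupported operator: {op}")
        result.setdefault(clause, []).append(condition)
    return result


qdrant_filter = build_filter([
    ("city", "=", "~London"),
    ("color", "=", "red"),
    ("rating", "<", 8),
])
print(qdrant_filter)
```

In a Qdrant search request this dictionary would sit in the filter field next to the vector and limit.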
You may also use ‘BETWEEN x AND y’ criteria. If the first value starts with a ‘~’ then the Qdrant ‘should’ list is used. Per Qdrant, these range comparisons are always numeric. An example is shown below.
The resulting query specification automatically generated by Qarbine is shown below.
select city,color,rating
from test_collection2
where rating >= 5
AND rating <= 7 AND nearVector(0.2,0.1,0.9,0.7)
The effective low level specification is seen by pressing Alt-Run.
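A ‘BETWEEN x AND y’ criterion corresponds naturally to a single Qdrant range condition with both bounds set. The sketch below shows that shape; between_condition is a hypothetical helper, and the specification Qarbine actually generates may be structured differently.

```python
def between_condition(field: str, low: float, high: float) -> dict:
    """Return a Qdrant range condition equivalent to low <= field <= high."""
    if low > high:
        raise ValueError("low must not exceed high")
    return {"key": field, "range": {"gte": low, "lte": high}}


# 'rating BETWEEN 5 AND 7' placed in the 'must' list of a filter.
rating_filter = {"must": [between_condition("rating", 5, 7)]}
print(rating_filter)
```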
For direct use of Qdrant’s set of filtering features, use Qarbine’s Data Source Designer.
A _nearText or _nearVector value is required by the QBE and RBE tools. The table below lists the possible values for _nearText and comments.
_nearText Value | Description |
---|---|
phrase | The phrase is sent to the default Qarbine AI Assistant service to obtain the vector which is subsequently used by the database. |
+alias phrase | The phrase is sent to the Qarbine AI Assistant specified by the given alias to obtain the vector embedding list which is subsequently used in the Qdrant database query. |
The table below lists the possible values for _nearVector and comments.
_nearVector Value | Description |
---|---|
list of numbers in brackets | The embedding size must match the database’s expectations. |
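The two _nearText forms can be told apart by the leading ‘+’. The sketch below shows one way to split such a value. The parse_near_text function is hypothetical, and the assumption that the alias ends at the first space is ours rather than documented Qarbine behavior.

```python
from typing import Optional


def parse_near_text(value: str) -> tuple[Optional[str], str]:
    """Split a _nearText value into (assistant alias or None, phrase).

    Assumption: in the '+alias phrase' form the alias runs up to the
    first space. None means the default AI Assistant is used.
    """
    text = value.strip()
    if text.startswith("+"):
        alias, _, phrase = text[1:].partition(" ")
        if not alias or not phrase.strip():
            raise ValueError("expected '+alias phrase'")
        return alias, phrase.strip()
    if not text:
        raise ValueError("empty phrase")
    return None, text


print(parse_near_text("cyber space"))
print(parse_near_text("+myHuggingFace cyber space"))
```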
Below is an example of using this QBE cell syntax.
To obtain the EXPLAIN information in Data Source Designer, QBE, or RBE, hold down the ALT key when clicking the run button.
Troubleshooting
- The size of the query’s vector must match that of the index data being searched; otherwise Qdrant returns an error.
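A size mismatch can be caught before the query is sent. The sketch below is a minimal check, assuming the caller already knows the vector size the index was created with (for example, 384 for embeddings from all-MiniLM-L6-v2). The check_vector_size function is a hypothetical helper, not part of Qarbine or Qdrant.

```python
def check_vector_size(vector: list[float], expected_size: int) -> None:
    """Raise ValueError if the query vector does not fit the index."""
    if len(vector) != expected_size:
        raise ValueError(
            f"query vector has {len(vector)} dimensions, "
            f"but the index expects {expected_size}"
        )


check_vector_size([0.2, 0.1, 0.9, 0.7], expected_size=4)  # passes silently
try:
    check_vector_size([0.2, 0.1, 0.9, 0.7], expected_size=384)
except ValueError as err:
    print(err)
```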
Next Steps
Accessing Your Database
To configure access to your database see the guides at
Querying Your Database
For database specific interaction guides navigate to
References
See https://qdrant.tech/documentation/ for more information.
See https://qdrant.tech/documentation/cloud/ for information about Qdrant cloud.